Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1384 by refactoring VersionMap to include all combinations of LoadTypes and whether to include_deleted #1443

Merged
merged 11 commits into from
Jul 24, 2024

Conversation

IvoDD
Copy link
Collaborator

@IvoDD IvoDD commented Mar 20, 2024

Reference Issues/PRs

Fixes #1384

What does this implement or fix?

The #1384 bug also applies to as_of=<VERSION_ID> it is just not seen in the repro because it is not looking for an old enough version.

The bug is that we perform the check on TOMBSTONE_ALL only for LOAD_UNDELETED and LOAD_LATEST_UNDELETED. However LOAD_DOWNTO and LOAD_FROM_TIME should also only work on undeleted versions when querying with as_of.

As discussed we would like to have the ability to both load deleted and undeleted with LOAD_DOWNTO and LOAD_FROM_TIME. This essentially leaves us with the full cartesian product of possible load types and whether to include deleted or not.

This the majority of this change is to do the refactor to extract the include_deleted out of the load type and ensure correctness.

The PR is split in several commits, each one of which should be reviewed separately. The commits have detailed descriptions but in a few words the commits do the following:

  • Introduce a failing cpp test to demonstrate Poor performance when reading as_of a date with many early versions deleted #1384
  • Moves entry updates when tombstoning inside write_tombstone
  • Revamp version map caching logic to simplify the checks and to make less unnecessary cache misses
  • Fix cache invalidation when tombstoning
  • Refactor the LoadType into a LoadStrategy which encapsulates the LoadType, include_deleted and the timestamp or version for LOAD_DOWNTO and LOAD_FROM_TIME. This fixes the Poor performance when reading as_of a date with many early versions deleted #1384 bug and provides new functionality
  • Small refactor with renames to make the logic inside follow_version_chain clearer.
  • Decrease the default unsync_tolerance because previously we were ~always skipping the cache and deeming it outdated. Also fixes a bug where we don't clear the cache when clearing the storage.

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch 2 times, most recently from ae70971 to bf97d88 Compare March 26, 2024 12:48
@IvoDD IvoDD changed the title Fix #1384 by stopping to follow version chain on tombstone all for LOAD_TIME_FROM and LOAD_DOWNTO Fix #1384 by refactoring VersionMap to include all combinations of LoadTypes and whether to include_deleted Mar 26, 2024
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch from 9a3b148 to f2cc889 Compare March 26, 2024 13:38
@IvoDD IvoDD added bug Something isn't working enhancement New feature or request and removed bug Something isn't working labels Mar 26, 2024
@IvoDD IvoDD self-assigned this Mar 26, 2024
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch 3 times, most recently from 5636473 to c0e5f35 Compare March 27, 2024 15:30
@IvoDD IvoDD marked this pull request as ready for review March 27, 2024 15:46
cpp/arcticdb/version/version_map_entry.hpp Show resolved Hide resolved
cpp/arcticdb/version/version_map_entry.hpp Outdated Show resolved Hide resolved
cpp/arcticdb/version/version_map_entry.hpp Outdated Show resolved Hide resolved
cpp/arcticdb/version/version_map_entry.hpp Outdated Show resolved Hide resolved
cpp/arcticdb/version/version_map.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@poodlewars poodlewars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've focused on the tests in this review as AlexO has done the rest.

@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch 5 times, most recently from 7ab9631 to 559f203 Compare April 18, 2024 11:33
@poodlewars poodlewars self-requested a review April 23, 2024 08:29
Copy link
Collaborator

@poodlewars poodlewars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with this other than the remark about the tombstone being added twice

return cached_timestamp <= requested_load_strategy.load_from_time_.value();
}
case LoadType::LOAD_ALL:
// All versions are cached if the last time we loaded the cache we used a LOAD_ALL strategy.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is too long, how about "Can only use cache when it was populated by a LOAD_ALL call, in which case it is only unsafe to use when the cache is of undeleted versions and the request is for all versions".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment as suggested

cpp/arcticdb/version/version_map_entry.hpp Show resolved Hide resolved
@@ -202,13 +201,12 @@ inline version_store::TombstoneVersionResult tombstone_version(
stream_id, version_id);
}
// We will write a tombstone key even when the index_key is not found
tombstone = version_map->write_tombstone(store, version_id, stream_id, entry, creation_ts);
version_map->write_tombstone(store, version_id, stream_id, entry, creation_ts);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a call to write_tombstone in delete_version_if_undeleted which I think needs to be updated. As things stand, it will be adding the tombstone twice, I think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed - the function that needs changing is actually in the enterprise repo, which I had checked out while doing this review. So this is fine - we just need to tweak the enterprise repo after this change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@poodlewars poodlewars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved but please wait for AlexO and William to have a look too before merging

@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch from 559f203 to 72365e1 Compare April 24, 2024 14:56
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch from 8c82d76 to c059481 Compare May 2, 2024 06:57
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch from c059481 to d2b86ac Compare May 3, 2024 07:37
python/arcticdb/version_store/_store.py Outdated Show resolved Hide resolved
python/arcticdb/version_store/_store.py Outdated Show resolved Hide resolved
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch 3 times, most recently from 1cdb249 to 867d3be Compare May 23, 2024 08:03
stream_version_data.react(query_2);
ASSERT_EQ(stream_version_data.count_, 2);
ASSERT_EQ(stream_version_data.load_param_.load_type_, LoadType::LOAD_DOWNTO);
ASSERT_EQ(stream_version_data.load_param_.load_until_version_, 4);
ASSERT_EQ(stream_version_data.load_strategy_.load_type_, LoadType::LOAD_DOWNTO);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we lose the extra LOAD? as in LoadType::LATEST not LoadType::LOAD_LATEST? It's a bit smurf naming at the moment (not your fault) https://devcards.io/smurf-naming-convention

@@ -49,7 +49,7 @@ std::vector<AtomKey> backwards_compat_write_and_prune_previous(std::shared_ptr<S
log::version().debug("Version map pruning previous versions for stream {}", key.id());

std::vector<AtomKey> output;
auto entry = version_map->check_reload(store, key.id(), LoadParameter{LoadType::LOAD_ALL}, __FUNCTION__);
auto entry = version_map->check_reload(store, key.id(), LoadStrategy{LoadType::LOAD_ALL, LoadObjective::ANY}, __FUNCTION__);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find a LoadObjective of ANY a bit confusing. Any what?

if (entry.loaded_until_ <= static_cast<VersionId>(load_param.load_until_version_.value())) {
bool loaded_as_far_as_version_id(const VersionMapEntry& entry, SignedVersionId requested_version_id) const {
if (requested_version_id >= 0) {
if (entry.loaded_with_progress_.oldest_loaded_index_version_ <= static_cast<VersionId>(requested_version_id)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entry.load_progress_?

@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch 2 times, most recently from 661ef67 to a55eba5 Compare July 12, 2024 08:08
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch from a55eba5 to 8f59430 Compare July 23, 2024 11:33
IvoDD added 11 commits July 23, 2024 17:49
- We fail to exit early when following the version chain with both
  LOAD_DOWNTO and LOAD_FROM_TIME. Adds a cpp test to show that
- Fix will be done in follow up commits
- Before this change we had to remember to update the version_map entry
  every time after calling write_tombstone. Now write_tombstone does it
internally.
- This is a small change which should not affect behavior.
- Tests are not updated regarding this change because they will be
  revamped in following commmits.
- Unify loaded_until_ and oldest_loaded_index_version_ as they should
  always have the same version (because version chain is sorted)
- Store a LoadProgress inside VersionMapEntry instead of a loaded_until
- The LoadProgress has information about earliest loaded version and
  earliest loaded timestamp. This allows us to resolve the TODOs inside
the has_cached_entry checks.
- Now we can always determine whether LOAD_FROM_TIME needs to read from
  cache by comparing with the LoadProgress timestamp.
- Revamps CacheInvalidation cpp tests to cover the new functionality
- Add an earliest_loaded_undeleted_timestamp to make follow up refactor
  easier.
- Makes the refactor in follow up commit easier to reason about.
- We keep an oldest_loaded_undeleted_index_version in the LoadProgress
- We use it so that if we tombstone that version (either by a tombstone
  or tombstone_all) we invalidate the cache for undeleted versions
Introduces a new type LoadStrategy which has the load_type, the
to_load enum, which indicates whether to load ANY or only UNDELETED
versions and the version/time we need to search for.

With this commit we mostly preserve existing behavior with a few
exceptions:
- The newly introduced deleted/undeleted variants of LOAD_DOWNTO and LOAD_FROM_TIME are introduced and are handled appropriately and their behavior is documented in LoadStrategy.
- Abstract away the batch version map merging of LoadStrategies and fix
  a small bug there.
- Fixes #1386 by passing include_deleted=false when looking for as_of
  time or version

Also adds an elaborate FollowingVersionChain and
FollowingVersionChainWithCaching cpp tests
Before this change version_map's reload_interval was equal to the
unsync_tolerance. This means that we would ~always miss the cache.

The check for outdated cache is as follows:
- When reading into cache set `last_reload_time = now -
  unsync_tolerance`
- When checking if cache is outdated we check whether `now -
  last_reload_time > reload_interval

This change updates the unsync tolerance to 200ms. This will mean the
cache is valid for 1.8s

Tests uncovered that we were not flushing the version_map when clearing
the storage. This change fixes this as well.
This commit is only a rename
`iterate_on_failure` used to provide a back up way to load versions if
the ref key is unreadable for some reason. The ref key should no longer
be unreadable, thus `iterate_on_failure` should be removed.

This commit removes the `iterate_on_failure`
- argument in the `lib._nvs` functions
- parameter of `VersionQuery`
- parameter of `LoadParameter` and entirely removes the `LoadParameter`
  and replaces it with `LoadStrategy`
- removes IterateOnFailure cpp test which used to test this behavior by
  artificially removing the ref key.
The lib._nvs api needs to be stable so we re-add the iterate_on_failure
argument to not break existing usage.

Instead whenever the argument is passed we issue a warning that it's
deprecated.
This commit is just a rename addressing some code review comments.
@IvoDD IvoDD force-pushed the fix-early-end-on-tombstone-all branch from 8f59430 to 272c32e Compare July 23, 2024 14:51
@IvoDD IvoDD merged commit df0082d into master Jul 24, 2024
113 of 114 checks passed
@IvoDD IvoDD deleted the fix-early-end-on-tombstone-all branch July 24, 2024 06:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Poor performance when reading as_of a date with many early versions deleted
4 participants